An Experimental Comparison of Pregel-like Graph Processing Systems
Abstract
The introduction of Google’s Pregel generated much interest in the field of large-scale graph data processing, inspiring the development of Pregel-like systems such as Apache Giraph, GPS, Mizan, and GraphLab, all of which have appeared in the past two years. To understand how Pregel-like systems perform, we conduct a study that experimentally compares Giraph, GPS, Mizan, and GraphLab on equal ground by considering graph- and algorithm-agnostic optimizations and by using several metrics. The systems are compared using four algorithms (PageRank, single-source shortest path, weakly connected components, and distributed minimum spanning tree) on up to 128 Amazon EC2 machines. We find that the system optimizations present in Giraph and GraphLab allow them to perform well. Our evaluation also shows Giraph 1.0.0’s considerable improvement since Giraph 0.1 and identifies areas of improvement for all systems.
Similar resources
From "Think Like a Vertex" to "Think Like a Graph"
To meet the challenge of processing rapidly growing graph and network data created by modern applications, a number of distributed graph processing systems have emerged, such as Pregel and GraphLab. All these systems divide input graphs into partitions, and employ a “think like a vertex” programming model to support iterative graph computation. This vertex-centric model is easy to program and h...
Lightweight Fault Tolerance in Large-Scale Distributed Graph Processing
The success of Google’s Pregel framework in distributed graph processing has inspired a surging interest in developing Pregel-like platforms featuring a user-friendly “think like a vertex” programming model. Existing Pregel-like systems support a fault tolerance mechanism called checkpointing, which periodically saves computation states as checkpoints to HDFS, so that when a failure happens, co...
Optimizing Graph Algorithms on Pregel-like Systems
We study the problem of implementing graph algorithms efficiently on Pregel-like systems, which can be surprisingly challenging. Standard graph algorithms in this setting can incur unnecessary inefficiencies such as slow convergence or high communication or computation cost, typically due to structural properties of the input graphs such as large diameters or skew in component sizes. We describ...
Tech Report: Compiling GreenMarl into GPS
The massive size of the data in large graph processing requires distributed processing. However, conventional frameworks for distributed graph processing, such as Pregel, use programming models that are well-suited for scalability but inconvenient for programming graph algorithms. In this paper, we use Green-Marl, a Domain-Specific Language for graph analysis, to describe graph algorithms intui...
A General-Purpose Query-Centric Framework for Querying Big Graphs
Pioneered by Google’s Pregel, many distributed systems have been developed for large-scale graph analytics. These systems employ a user-friendly “think like a vertex” programming model, and exhibit good scalability for tasks where the majority of graph vertices participate in computation. However, the design of these systems can seriously under-utilize the resources in a cluster for processing ...
Journal title:
- PVLDB
Volume 7
Publication date: 2014